@pralay-das pralay-das commented Oct 9, 2025

feat:

  • Support fused RoPE in flash attention
  • Compute RoPE directly on GMEM (global memory) data, both reading and writing it there

Used PR #498 as a reference for chunk prefill.
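
For readers unfamiliar with RoPE, the rotation this PR fuses into the attention kernel can be sketched as below. This is a minimal scalar reference, not the kernel code from this PR: the function name `rope_inplace`, the interleaved pairing of elements `(2i, 2i+1)`, and the base of 10000 are assumptions made for illustration. The list `x` stands in for one head vector in global memory, and it is updated in place to mirror the read-and-write-on-GMEM approach.

```python
import math


def rope_inplace(x, pos, base=10000.0):
    """Apply rotary position embedding to one head vector, in place.

    Hypothetical reference helper: `x` is a list of floats with even
    length, `pos` is the token position. Pairs (2i, 2i+1) are rotated
    by the angle pos * base**(-2i/d); this pairing is an assumption,
    since some implementations pair (i, i + d/2) instead.
    """
    d = len(x)
    if d % 2 != 0:
        raise ValueError("head dimension must be even")
    for i in range(d // 2):
        theta = pos * base ** (-2.0 * i / d)
        c, s = math.cos(theta), math.sin(theta)
        # Read both elements of the pair before overwriting either one.
        x0, x1 = x[2 * i], x[2 * i + 1]
        x[2 * i] = x0 * c - x1 * s
        x[2 * i + 1] = x0 * s + x1 * c
    return x


def dot(a, b):
    """Plain dot product, used to check the relative-position property."""
    return sum(u * v for u, v in zip(a, b))
```

The property that makes RoPE usable inside attention is that the score between a rotated query at position m and a rotated key at position n depends only on m - n, so `dot(rope_inplace(list(q), 7), rope_inplace(list(k), 3))` should match `dot(rope_inplace(list(q), 4), rope_inplace(list(k), 0))` up to rounding.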

taozha2 and others added 12 commits October 9, 2025 10:38
This change imports `SYCLCompat` into the cutlass-sycl repo as `compat`.
Previous dependencies on `syclcompat` are changed to `compat`.
This PR also fixes some failures of `SYCLCompat` in oneAPI 2025.2.

---------

Co-authored-by: Roland Schulz <[email protected]>
1. This version will compute RoPE on GMEM data
@pralay-das pralay-das force-pushed the dev/pralay/chunk_prefill_rope_on_gmem branch from 2b3344a to ce0bbf2 Compare October 9, 2025 10:39
@pralay-das pralay-das changed the title from "[PYTORCHDGQ-6865] Added support for RoPE on chunk prefill" to "[PYTORCHDGQ-6865] Added support for RoPE on chunk prefill [WIP]" Oct 9, 2025
@Antonyvance Antonyvance added the redesign required Implementation require a redesign label Oct 17, 2025
@pralay-das
Author

duplicate PR: #569

@pralay-das pralay-das closed this Oct 20, 2025
6 participants